Glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the gene encoding the same

ABSTRACT

The present invention relates to a novel 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS). It is highly tolerant to glyphosate, the competitive inhibitor of the substrate phosphoenolpyruvate (PEP). The invention also relates to a gene encoding the synthase, a construct and a vector comprising said gene, and a host cell transformed with said construct or vector.

TECHNICAL FIELD

The present invention relates to a novel glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), an isolated nucleic acid sequence encoding the synthase, a nucleic acid construct comprising said sequence or the coding region, a vector carrying said sequence or the coding region or said nucleic acid construct, and a host cell transformed with said construct or vector.

BACKGROUND

The 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) is a key enzyme involved in the aromatic amino acid synthesis pathway in plants and bacteria. Glyphosate, which is also referred to as N-(phosphonomethyl)glycine, is a broad-spectrum, highly efficient post-emergence herbicide. Glyphosate is a competitive inhibitor with respect to phosphoenolpyruvate (PEP), which is one of the substrates of EPSPS. Glyphosate blocks the EPSPS-catalyzed conversion of PEP and shikimate-3-phosphate to 5-enolpyruvylshikimate-3-phosphate, thereby blocking the shikimate pathway, which supplies the precursors for the synthesis of aromatic amino acids, and leading to the death of plants and bacteria.

The glyphosate tolerance of a plant may be obtained by stably introducing a gene encoding glyphosate-tolerant EPSPS to the plant genome. There are mainly two classes of known glyphosate-tolerant EPSPS genes: Class I (see, e.g. U.S. Pat. No. 4,971,908; U.S. Pat. No. 5,310,667; U.S. Pat. No. 5,866,775) and Class II (see, e.g. U.S. Pat. No. 5,627,061; U.S. Pat. No. 5,633,435). These genes have been successfully introduced into plant genomes, and the glyphosate-tolerant plant cells and plants are obtained.

The invention aims to provide a novel EPSPS of native sequence that is tolerant to glyphosate.

SUMMARY

An object of the invention is to provide a newly isolated nucleic acid sequence encoding glyphosate tolerant EPSPS protein.

A further object of the invention is to provide a novel glyphosate-tolerant EPSPS protein.

A further object of the invention is to provide a nucleic acid construct, formed by operably linking the above-said nucleic acid sequence to a control sequence essential for expressing said nucleic acid sequence in a selected host cell. Specifically, the control sequence includes optionally a promoter, an enhancer, a leader sequence, a polyadenylation signal, and the initiation and termination sequences for transcription and translation.

A further object of the invention is to provide a vector comprising the above-said nucleic acid sequence or nucleic acid construct.

A further object of the invention is to provide a host cell transformed with the above-said construct or vector. Said host cell may express the protein encoded by the above-said nucleic acid sequence under appropriate conditions, enabling said protein to exhibit enzymatic EPSPS activity and glyphosate tolerance, whereby the host cell acquires glyphosate tolerance.

A further object of the invention is to provide the above-said host cell and progeny cells thereof. Said cells contain the aforesaid nucleic acid sequence or the coding region, or the nucleic acid construct or vector, and are glyphosate-tolerant.

The other purposes of the invention are illustrated in the following description and examples.

DISCLOSURE OF THE INVENTION

The invention provides a novel glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) having a native sequence. The term “native sequence” denotes a sequence which is not modified by mutagenesis, or by biological or chemical modifications, such as genetic engineering. The invention provides an isolated amino acid sequence of said EPSPS (SEQ ID NO:3). Any amino acid sequence which is modified by deletion, addition and/or substitution of one or more amino acid residues in SEQ ID NO:3 is included in the scope of the invention, provided that a protein having the modified sequence has EPSPS activity and glyphosate tolerance.

The invention further provides an isolated nucleic acid sequence encoding said EPSPS (SEQ ID NO:2, in particular the coding region). Any nucleic acid sequence which is modified by deletion, addition and/or substitution of one or more nucleotides in SEQ ID NO: 2 is included in the scope of the invention, provided that the modified sequence encodes a protein of EPSPS activity and glyphosate tolerance.

The nucleic acid construct of the invention is constructed by operably linking the EPSPS-encoding nucleic acid sequence of the invention with other homologous or heterologous sequence.

According to the method of the invention, the nucleic acid construct or the isolated nucleic acid sequence of the invention is incorporated into a vector, and a selected host cell is transformed with said vector. The EPSPS enzyme of the invention is expressed. The recombinant host cell is thus conferred with a tolerance to glyphosate. Alternatively, instead of transformation with a vector, the isolated nucleic acid sequence or the nucleic acid construct of the invention is introduced into a host cell directly by conventional methods such as electroporation. The EPSPS enzyme of the invention is expressed and the host cell is conferred with glyphosate tolerance. The vector and the recombinant host cell thus obtained, as well as the method for obtaining the cell are included in the scope of the invention.

DEFINITIONS

The term “percent (%) sequence homology” used herein refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the target sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence homology, and not considering any conservative substitutions as parts of the sequence homology. The sequences herein include amino acid sequences and nucleotide sequences. The determination of percent (%) sequence homology may be achieved in various ways that are within the skill in the art, for example, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR). Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithm needed to achieve maximal alignment over the full-length of the sequences being compared.
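The definition above can be illustrated with a minimal Python sketch that computes the percentage for two sequences that have already been aligned, with gaps written as "-". The function name `percent_identity` is hypothetical, and the denominator (the number of residues in the candidate sequence) is an assumption drawn from the wording of the definition; programs such as BLAST or ALIGN-2 apply their own conventions and may report different values.

```python
def percent_identity(candidate: str, target: str, gap: str = "-") -> float:
    """Percent of candidate residues identical to the aligned target residue.

    Both strings must come from the same alignment and have equal length.
    Conservative substitutions are NOT counted as matches.
    """
    if len(candidate) != len(target):
        raise ValueError("aligned sequences must have equal length")
    residues = 0  # non-gap positions in the candidate sequence
    matches = 0   # positions where candidate and target residues are identical
    for c, t in zip(candidate.upper(), target.upper()):
        if c == gap:
            continue  # a gap in the candidate is not a candidate residue
        residues += 1
        if c == t:
            matches += 1
    if residues == 0:
        raise ValueError("candidate contains no residues")
    return 100.0 * matches / residues


# Toy alignment: 5 candidate residues, 4 of them identical to the target.
print(percent_identity("MKV-LA", "MRVQLA"))  # 80.0
```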

The term “nucleic acid construct” used herein refers to a single-stranded or double-stranded nucleic acid molecule, which is isolated from a native gene, or is modified to combine nucleic acid fragments in a manner not existing in nature. When the nucleic acid construct contains all the control sequences required for expressing the EPSPS of the invention, the term “nucleic acid construct” is synonymous to the term “expression cassette”.

The term “control sequence” used herein comprises all components, which are essential or advantageous for the expression of a polypeptide of the present invention. Each control sequence may be native or foreign to the nucleic acid sequence encoding said polypeptide. Such control sequences include, but are not limited to, a leader sequence, polyadenylation sequence, a propeptide sequence, promoter, and transcription terminator. The control sequences include at least a promoter and termination signals for both transcription and translation. A control sequence may be provided with a linker to introduce specific restriction sites facilitating the ligation of the control sequence with the coding region of the nucleic acid sequence encoding a heterologous polypeptide.

The term “operably linking” denotes the linkage of the isolated nucleic acid sequence of the invention to any other sequences homologous or heterologous to said sequence, so that they together encode a product. When necessary, said sequences may be separated by methods such as restriction digestion. The homologous or heterologous sequence as disclosed herein could be any sequence, such as any control sequence, that directs the expression of the isolated nucleic acid sequence of the invention in a selected host cell, or a sequence coding for a fusion protein together with the isolated nucleic acid sequence of the invention.

The term “host cell” used herein refers to any cell capable of receiving the isolated nucleic acid sequence of the invention, or receiving the construct or vector comprising said sequence, and keeping them stable therein. The host cell will obtain the feature determined by the isolated nucleic acid sequence of the invention when the cell contains said sequence.

DETAILED DESCRIPTION

The inventors have surprisingly discovered and isolated a novel glyphosate-tolerant EPSPS-encoding gene. The sequence of said gene (SEQ ID NO:2) and the coding region (CDS, nucleotides 574–1803) are disclosed herein. The amino acid sequence (SEQ ID NO:3) encoded by said CDS is also disclosed herein. DNAMAN version 4.0 is used with the CLUSTAL format to align the amino acid sequence shown in SEQ ID NO:3 with the known sequences of EPSPS Classes I and II. It is found that the sequence of the invention does not comprise any sequence claimed by preceding patents (see FIG. 2). A BLAST search of the GenBank protein sequence database shows that SEQ ID NO:3 of the invention is 37% homologous with the EPSPS amino acid sequence derived from Clostridium acetobutylicum, and is 20% homologous with the EPSPS amino acid sequence derived from E. coli (see FIG. 2). An NCBI-BLAST search indicates that the sequence shown in nucleotides 574–1803 of SEQ ID NO:2 is not homologous with any known nucleic acid sequence. Therefore, the nucleic acid sequence and the protein described in the invention are novel.

Hence, one aspect of the invention relates to an isolated nucleic acid sequence, which comprises the nucleic acid sequence shown in SEQ ID NO:2 and encodes a glyphosate tolerant EPSPS.

The invention also relates to an isolated nucleic acid sequence as shown in nucleotides 574–1803 of SEQ ID NO:2, which encodes a glyphosate tolerant EPSPS.
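The coordinates of the coding region can be checked with simple arithmetic, sketched below in Python. The function name `cds_summary` is hypothetical, and the residue count assumes that the region ends with a stop codon, which the text does not state explicitly. The coordinates are treated as 1-based and inclusive, as is usual in sequence listings.

```python
def cds_summary(start: int, end: int) -> dict:
    """Length bookkeeping for a coding region given 1-based inclusive coordinates."""
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("region length is not a multiple of 3")
    codons = length_nt // 3
    return {
        "length_nt": length_nt,
        "codons": codons,
        # Assumption: the final codon is a stop codon and encodes no residue.
        "residues_if_stop_included": codons - 1,
    }


def extract_region(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive region out of a Python (0-based) string."""
    return sequence[start - 1:end]


print(cds_summary(574, 1803))
# {'length_nt': 1230, 'codons': 410, 'residues_if_stop_included': 409}
```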

The invention also relates to a nucleic acid sequence obtained by modifying one or more nucleotides, or by deleting and/or adding 3 or a multiple of 3 nucleotides, in SEQ ID NO:2 or in the sequence of nucleotides 574–1803 of SEQ ID NO:2, wherein said nucleotide sequence is capable of encoding a protein with 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) activity and glyphosate tolerance. The modification, deletion and addition of nucleotides are conventional within the skill in the art.
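The restriction to 3 nucleotides or a multiple of 3 keeps the downstream reading frame intact. The following Python sketch illustrates this on a short made-up sequence; the sequence and the helper names `codons`, `delete` and `frame_preserved` are purely illustrative and are not taken from SEQ ID NO:2.

```python
def codons(seq: str) -> list:
    """Split a sequence into complete codons, dropping any trailing 1-2 nt."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq: str, pos: int, n: int) -> str:
    """Remove n nucleotides starting at 0-based position pos."""
    return seq[:pos] + seq[pos + n:]


def frame_preserved(original: str, modified: str) -> bool:
    """True if the net length change is zero or a multiple of 3."""
    return (len(modified) - len(original)) % 3 == 0


toy = "ATGGCTAAAGGTCTGTAA"            # ATG GCT AAA GGT CTG TAA (illustrative only)
in_frame = delete(toy, 3, 3)           # removes the whole GCT codon
shifted = delete(toy, 3, 1)            # removes a single nucleotide

print(codons(in_frame), frame_preserved(toy, in_frame))
# ['ATG', 'AAA', 'GGT', 'CTG', 'TAA'] True
print(codons(shifted), frame_preserved(toy, shifted))
# ['ATG', 'CTA', 'AAG', 'GTC', 'TGT'] False
```

In the first case every downstream codon, including the stop codon, is unchanged; in the second case all downstream codons differ and the stop codon is lost.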

The invention also relates to a nucleotide sequence which is of the homology, such as the homology of at least about 65%, preferably at least about 66%, preferably at least about 67%, preferably at least about 68%, preferably at least about 69%, preferably at least about 70%, preferably at least about 71%, preferably at least about 72%, preferably at least about 73%, preferably at least about 74%, preferably at least about 75%, preferably at least about 76%, preferably at least about 77%, preferably at least about 78%, preferably at least about 79%, more preferably at least about 80%, more preferably at least about 81%, more preferably at least about 82%, more preferably at least about 83%, more preferably at least about 84%, more preferably at least about 85%, more preferably at least about 86%, more preferably at least about 87%, more preferably at least about 88%, more preferably at least about 89%, more preferably at least about 90%, more preferably at least about 91%, more preferably at least about 92%, more preferably at least about 93%, more preferably at least about 94%, more preferably at least about 95%, more preferably at least about 96%, more preferably at least about 97%, more preferably at least about 98%, most preferably at least about 99%, with the nucleic acid sequence defined by SEQ ID NO:2 or nucleotides 574–1803 of SEQ ID NO:2. Such a sequence is within the scope of the invention provided that it encodes a protein having the glyphosate tolerant EPSPS activity.

The isolated nucleic acid sequence of the invention may be cloned from nature with the methods disclosed in the Examples of the invention. The cloning method may comprise the steps of isolating a nucleic acid fragment comprising the nucleic acid sequence encoding the protein of interest and digesting it with restriction enzymes, inserting said fragment into a vector, and incorporating the vector into a host cell, whereby copies or clones of said nucleic acid sequence are replicated in said host cell. However, it may be easier to synthesize the sequence with an automatic nucleotide synthesizer (such as the ABI394 DNA synthesizer of Applied Biosystems) according to the nucleotide sequence disclosed herein, or to synthesize the fragments of the nucleic acid sequence separately and to ligate the fragments into a full-length sequence with conventional ligases and vectors using the method of Chinese Patent application 99103472.4, which was published on Oct. 4, 2000.

The nucleic acid sequence of the invention may be genomic, cDNA, RNA, semi-synthesized, completely synthesized sequence, or any combination thereof.

Another aspect of the invention relates to an isolated nucleic acid sequence encoding a protein with EPSPS activity and glyphosate tolerance, wherein the protein comprises the amino acid sequence as shown in SEQ ID NO:3.

The invention further relates to an isolated nucleic acid sequence encoding a protein with EPSPS activity and glyphosate tolerance, wherein the amino acid sequence of the protein comprises substitution, deletion and/or addition of one or more amino acid residues in the amino acid sequence of SEQ ID NO:3, while the EPSPS activity and glyphosate tolerance are retained. Said substitution, deletion and/or addition of one or more amino acid residues are within the conventional technique in the art. Such a change of amino acids is preferably a minor change, namely: a conservative amino acid substitution without prominent influence on the folding and/or activity of the protein; a minor deletion of generally about 1–30 amino acids; a minor extension at the amino terminus or carboxyl terminus, such as an extension of one methionine residue at the amino terminus; or a minor linker peptide of, for example, about 20–25 residues in length.

Examples of conservative substitutions are those occurring within the following amino acid groups: basic amino acids (e.g. Arg, Lys and His), acidic amino acids (e.g. Glu and Asp), polar amino acids (e.g. Gln and Asn), hydrophobic amino acids (e.g. Leu, Ile and Val), aromatic amino acids (e.g. Phe, Trp and Tyr), and small amino acids (e.g. Gly, Ala, Ser, Thr, and Met). Amino acid substitutions which usually do not change a particular activity are known in the art, and have been described by H. Neurath and R. L. Hill, The Proteins, Academic Press, New York, 1979. The most common substitutions are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, Ala/Glu, and Asp/Gly, and the reverse substitutions thereof.
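The amino acid groups listed above can be written as a small lookup table, sketched here in Python. The names `CONSERVATIVE_GROUPS` and `shared_group` are hypothetical; the groups use the standard three-letter codes (Gln for glutamine, Trp for tryptophan) and are only the examples given in this paragraph, not an exhaustive substitution matrix.

```python
# Example groups from the paragraph above, keyed by a descriptive label.
CONSERVATIVE_GROUPS = {
    "basic": {"Arg", "Lys", "His"},
    "acidic": {"Glu", "Asp"},
    "polar": {"Gln", "Asn"},
    "hydrophobic": {"Leu", "Ile", "Val"},
    "aromatic": {"Phe", "Trp", "Tyr"},
    "small": {"Gly", "Ala", "Ser", "Thr", "Met"},
}


def shared_group(residue_a: str, residue_b: str):
    """Return the name of the group containing both residues, or None."""
    for name, members in CONSERVATIVE_GROUPS.items():
        if residue_a in members and residue_b in members:
            return name
    return None


print(shared_group("Leu", "Ile"))  # hydrophobic
print(shared_group("Arg", "Asp"))  # None
```

A result of `None` only means that the pair does not fall within one of the example groups; several of the common substitutions listed above (for instance Ser/Asn) cross group boundaries.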

To those skilled in the art, it is obvious that such a substitution may occur at regions other than those important for function and still produce an active polypeptide. For the polypeptide encoded by the isolated nucleic acid sequence of the invention, the amino acid residues which are important for function, and which are thus selected not to be substituted, may be identified by methods known in the art, for example site-directed mutagenesis or alanine scanning mutagenesis (see, e.g. Cunningham and Wells, 1989, Science 244: 1081–1085). The latter technique comprises introducing a mutation at each positively charged residue in the molecule, and determining the glyphosate-tolerant EPSPS activity of the mutated molecule, thereby determining the amino acid residues important for the activity. The site of substrate-enzyme interaction can also be determined by analysis of the 3D structure, which may be determined by techniques such as nuclear magnetic resonance, crystallography or photoaffinity labeling (see, e.g. de Vos et al, 1992, Science 255: 306–312; Smith et al, 1992, J. Mol. Biol. 224: 899–904; Wlodawer et al, 1992, FEBS Letters 309: 59–64).

The invention also relates to an amino acid sequence of the homology, such as the homology of at least about 50%, at least about 60%, at least about 65%, preferably at least about 66%, preferably at least about 67%, preferably at least about 68%, preferably at least about 69%, preferably at least about 70%, preferably at least about 71%, preferably at least about 72%, preferably at least about 73%, preferably at least about 74%, preferably at least about 75%, preferably at least about 76%, preferably at least about 77%, preferably at least about 78%, preferably at least about 79%, more preferably at least about 80%, more preferably at least about 81%, more preferably at least about 82%, more preferably at least about 83%, more preferably at least about 84%, more preferably at least about 85%, more preferably at least about 86%, more preferably at least about 87%, more preferably at least about 88%, more preferably at least about 89%, more preferably at least about 90%, more preferably at least about 91%, more preferably at least about 92%, more preferably at least about 93%, more preferably at least about 94%, more preferably at least about 95%, more preferably at least about 96%, more preferably at least about 97%, more preferably at least about 98%, most preferably at least about 99%, with the amino acid sequence as shown in SEQ ID NO:3. Such a sequence is within the scope of the invention provided that a protein comprising said homologous sequence has the glyphosate-tolerant EPSPS activity.

The protein encoded by the isolated nucleic acid sequence according to the invention has at least 20%, preferably at least 40%, more preferably at least 60%, even more preferably at least 80%, even more preferably at least 90%, most preferably at least 100% of the EPSPS activity of the amino acid sequence as shown in SEQ ID NO:3.

The invention also relates to a nucleic acid construct comprising the above-defined nucleic acid sequence, or its coding sequence (e.g. nucleotides 574–1803 of SEQ ID NO:2).

The nucleic acid construct of the invention further comprises a control sequence essential to the expression of aforesaid sequence in a selected host cell. The control sequence is operably linked to the aforesaid isolated nucleic acid sequence in the nucleic acid construct.

The control sequence may be a promoter, i.e. a transcriptional control sequence directing the expression of a polypeptide. The promoter may be any nucleic acid sequence having transcriptional activity in the selected cell, such as a mutated, truncated, or hybrid promoter. Such a promoter may be obtained from a gene encoding an extracellular or intracellular polypeptide. Such a polypeptide may or may not be homologous to the cell. Various promoters used in prokaryotic cells are known in the art.

The control sequence may also be an appropriate transcription terminator sequence, a sequence recognized by a selected host cell to terminate transcription. The terminator sequence is operably linked to the 3′ end of the nucleic acid sequence coding for polypeptide. Any terminator which is functional in the selected host cell may be used in the invention.

The control sequence may also be an appropriate leader sequence, a non-translated region of an mRNA important for the translation in cell. The leader sequence is operably linked to the 5′ end of the nucleic acid sequence coding for polypeptide. Any leader sequence which is functional in the selected host cell may be used in the invention.

The control sequence may also be a polyadenylation sequence. The sequence is operably linked to the 3′ end of the nucleic acid sequence, and when transcribed, is recognized by the cell as a signal to add polyadenosine residues to the transcribed mRNA. Any polyadenylation sequence which is functional in the selected host cell may be used in the invention.

The nucleic acid construct may further comprise one or more nucleic acid sequences which encode one or more factors useful in directing the expression of a foreign polypeptide, such as a transcription-activating factor (e.g. a trans-acting factor), a partner protein and a processing protein. Any factor effective in host cells, in particular bacterial cells and plant cells may be used in the invention. The nucleic acid encoding one or more such factors is not always linked in tandem with the nucleic acid encoding a foreign polypeptide.

The nucleic acids and control sequences described above may be joined in a conventional vector, such as a plasmid or a virus, to produce a “recombinant expression vector” according to the invention, using a method well known to those skilled in the art (see J. Sambrook, E. F. Fritsch and T. Maniatis, 1989, Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor, N.Y.). The vector may have one or more convenient restriction sites. The choice of vector usually depends on the compatibility of the vector with the host cell being used. The vector may be linear or closed circular, and it may be autonomously replicable as an extrachromosomal entity whose replication is independent of chromosomal replication, e.g. a plasmid (an extrachromosomal element), a minichromosome, or an artificial chromosome. The vector may comprise any means for ensuring self-replication. Alternatively, the vector is integrated into the genome and replicated with the chromosome after being introduced into the cell. The vector system may be a single vector or plasmid, or two or more vectors or plasmids (which together contain the nucleic acid sequence of interest), or a transposon.

For integration into the host cell genome, the vector may comprise additional nucleic acid sequences that direct the integration of said vector into the genome through homologous recombination. The additional nucleic acid sequences enable said vector to integrate into the genome at a precise position. To increase the possibility of integration into a precise position, the integration elements should preferably comprise a sufficient number of nucleic acids, for example 100–1500 base pairs, preferably 400–1500 base pairs, most preferably 800–1500 base pairs, which are highly homologous to the corresponding target sequences thereof to increase the possibility of homologous integration. Said integration element may be any sequence that is homologous to the target sequence in the genome of the cell. Moreover, said integration elements may be non-coding or coding nucleic acid sequences. Alternatively, said vector may integrate into the genome of the cell through non-homologous integration.

For autonomous replication, the vector may further contain an origin of replication enabling said vector to replicate autonomously in bacterial cells and plant cells.

The invention also relates to a recombinant “host cell” comprising the nucleic acid sequence of the invention. The nucleic acid construct or vector comprising the nucleic acid sequence of the invention may be introduced into the host cell, so that the nucleic acid sequence of the invention is integrated into the chromosome or the vector is autonomously replicated, whereby the nucleic acid sequence of the invention is expressed stably by the host cell and makes the host cell glyphosate-tolerant.

The host cell may be a prokaryotic cell such as a bacterial cell, but more preferably a eukaryotic cell such as a plant cell.

The common bacterial cells include the cells of Gram-positive bacteria such as Bacillus, or the cells of Gram-negative bacteria such as Escherichia or Pseudomonas. In a preferable embodiment, the bacterial host cell is a cell of E. coli.

The introduction of an expression vector into a bacterial host cell may be achieved by protoplast transformation (see Chang and Cohen, 1979, Molecular General Genetics 168: 111–115), using competent cells (see Young and Spizizen, 1961, J. Bacteriol. 81: 823–829, or Dubnau and Davidoff-Abelson, 1971, J. Mol. Biol. 56: 209–221), electroporation (see Shigekawa and Dower, 1988, Biotechniques 6: 742–751), or conjugation (see Koehler and Thorne, 1987, J. Bacteriol. 169: 5271–5278).

DESCRIPTION OF FIGURES

FIG. 1 shows the map of plasmid pKU2004.

FIG. 2 shows the amino acid sequence alignment of the EPSPS of Pseudomonas putida P. P4G-1 (SEQ ID NO:3) with various known EPSPSs (SEQ ID NOs 13–19, 12, and 20–25, respectively). Sequences shown boxed and shaded are claimed in previous patents.

FIG. 3 shows the growth curves of E. coli XL1-Blue MR carrying different EPSPS genes at different glyphosate concentrations.

EXAMPLES

Example 1 Isolation of Glyphosate-Tolerant Strains

Samples were taken from the neighborhood of a glyphosate-producing factory in Hebei Province, China, and were diluted and spread on media comprising glyphosate. A total of 48 strains with high tolerance to, and high degradation ability for, glyphosate are isolated. One of these strains, 4G-1, is able to grow on a medium with 400 mM glyphosate, and is resistant to 100 mg/L Ampicillin. Said strain is selected for further studies.

Example 2 Identification of Glyphosate-Tolerant Strains

a) Mini-Prep of the Total DNA of Strain 4G-1

Strain 4G-1 is inoculated in 3 ml of LB liquid medium with 100 mg/L Ampicillin, and cultured at 28° C. overnight while shaking. The culture is centrifuged at 12000 rpm and the pellet is resuspended in 0.5 ml of Solution I (10 mM NaCl, 20 mM Tris-HCl pH 8.0, 1 mM EDTA). Proteinase K (Merck, Germany) and SDS are added to final concentrations of 10 μg/ml and 0.5%, respectively. The suspension is then mixed by gentle inversion and left at 50° C. for more than 6 hrs. An equal volume of phenol is added. The mixture is gently inverted and left at room temperature for 10 min. Then, the mixture is centrifuged at 12000 rpm at room temperature for 5 min. The supernatant aqueous phase is drawn off with tips (Axygen, USA) and re-extracted with an equal volume of phenol/chloroform. To the supernatant are added 10% by volume of 3 M NaAc and 2.3 volumes of ethanol for precipitation. The mixture is centrifuged at 12000 rpm at −10° C. for 25 min. After the supernatant is discarded, the precipitate is washed with 500 μl of 70% ethanol and centrifuged at 12000 rpm for 1 min. After the supernatant is drawn off completely, the precipitate is dried in a Savant apparatus for 20 min or in an incubator at 37° C. for 1 hr. The precipitate is dissolved in 100 μl of TE solution (10 mM Tris-Cl, 1 mM EDTA, pH 8.0), then frozen at −20° C. for further studies.
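The reagent volumes for the ethanol precipitation step scale with the volume of the recovered aqueous phase. A minimal Python sketch of that scaling follows; the function name `precipitation_volumes` is hypothetical, and it assumes that both the 10% sodium acetate and the 2.3 volumes of ethanol are calculated relative to the aqueous phase before any addition, which the protocol does not specify.

```python
def precipitation_volumes(aqueous_ul: float,
                          salt_fraction: float = 0.10,
                          ethanol_volumes: float = 2.3) -> dict:
    """Volumes (in microlitres) of 3 M NaAc and ethanol for a given aqueous phase.

    Assumption: both additions are relative to the starting aqueous volume.
    """
    if aqueous_ul <= 0:
        raise ValueError("aqueous volume must be positive")
    naac = round(aqueous_ul * salt_fraction, 1)
    ethanol = round(aqueous_ul * ethanol_volumes, 1)
    return {
        "naac_ul": naac,
        "ethanol_ul": ethanol,
        "total_ul": round(aqueous_ul + naac + ethanol, 1),
    }


# Example: about 0.5 ml of aqueous phase recovered from the mini-prep.
print(precipitation_volumes(500))
# {'naac_ul': 50.0, 'ethanol_ul': 1150.0, 'total_ul': 1700.0}
```

For a 500 μl aqueous phase the total of about 1.7 ml would exceed a standard 1.5 ml microcentrifuge tube, so in practice the sample might be split or a smaller volume recovered.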

b) Cloning of 16S rRNA of Strain 4G-1

A pair of universal primers of 16S rRNA (primer 1:5′ AGAGTTTG ACATGGCTCAG 3′ (SEQ ID NO:4) and primer 2:5′ TACGGTTACCTTGTTACGACTT 3′ (SEQ ID NO:5)) is synthesized. The PCR amplification reaction is run in the Robocycler 40 (Stratagene) using the primers. The reaction system is: 1 μl of total DNA of strain 4G-1 as template, 5 μl of buffer, 4 μl of 10 μmol dNTP, 1 μl of 20 pmol/μl primer 1, 1 μl of 20 pmol/μl primer 2 and 37 μl of deionized water. The reaction conditions are: 94° C. for 10 min, after which 1 μl of 5U Pyrobest Taq DNA polymerase is added; then 30 cycles of 94° C. for 1 min, 50° C. for 1 min and 72° C. for 2 min; and a final extension at 72° C. for another 10 min. A fragment of about 1.5 kb is obtained. The PCR product is purified according to the method provided by the manufacturer of the PCR product purification kit, Boehringer.
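The volumes and cycling times listed above can be tallied with a short Python sketch. The names `REACTION_UL`, `program_minutes` and `primer_final_uM` are hypothetical, and the run time ignores ramping between temperatures and the pause for adding the polymerase.

```python
# Component volumes in microlitres, as listed in the reaction system above.
REACTION_UL = {
    "template DNA": 1,
    "buffer": 5,
    "dNTP": 4,
    "primer 1": 1,
    "primer 2": 1,
    "deionized water": 37,
    "polymerase (added after the initial 94 C step)": 1,
}


def program_minutes(initial: float = 10, cycles: int = 30,
                    steps: tuple = (1, 1, 2), final: float = 10) -> float:
    """Nominal hold time of the program, excluding temperature ramps."""
    return initial + cycles * sum(steps) + final


def primer_final_uM(volume_ul: float, stock_pmol_per_ul: float,
                    total_ul: float) -> float:
    """Final primer concentration; 1 pmol/ul equals 1 micromolar."""
    return volume_ul * stock_pmol_per_ul / total_ul


total_ul = sum(REACTION_UL.values())
print(total_ul)                          # 50
print(program_minutes())                 # 140
print(primer_final_uM(1, 20, total_ul))  # 0.4
```

The 50 μl total is consistent with the 5 μl of buffer being a 10× stock, although the text does not state the buffer concentration.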

The purified PCR product is subjected to a deoxyadenosine (A) tailing reaction. The reaction system is: 20 μl (2 μg) of purified PCR product, 5 μl of buffer, 1 μl (5U) of Taq DNA polymerase (DingGuo Ltd, Beijing), 4 μl of 5 μmol dATP and 20 μl of deionized water.

The resulting products are purified by a purification method for PCR products, and are ligated to a T vector (Takara, Dalian) according to the instructions of the manufacturer, Takara, to obtain plasmid pKU2000. The sequencing result is shown in SEQ ID NO:1. Using BLAST software and the BLASTP 2.2.2 [Dec. 14-2001] database for sequence alignment at the American National Center for Biotechnology Information (NCBI), it is found that said sequence is 99% homologous to the nucleotide sequence of the 16S rRNA of Pseudomonas putida. Thus, strain 4G-1 is thought to be P. putida, and is designated as P. putida 4G-1 (abbreviated as P. P4G-1). Said strain was deposited on Apr. 30, 2002 at the Chinese General Microbiological Culture Centre (CGMCC) (ZhongGuan Cun, Beijing, China) with the accession number CGMCC 0739.

Example 3 Construction of the Genomic Library of Strain 4G-1

a) Maxi-prep of the Total DNA of 4G-1

The strain P. P4G-1 is inoculated into 100 ml of LB medium (supplemented with 100 mg/L Ampicillin) in a 250 ml flask, and cultured at 28° C. overnight while shaking at 200 rpm. The culture is centrifuged at 8000 rpm for 5 min, then the pellet is resuspended in 14 ml of Solution I. Proteinase K (Merck, Germany) and SDS are added to final concentrations of 10 μg/ml and 0.5%, respectively. The mixture is mixed by gentle inversion and left at 50° C. for more than 6 hr. An equal volume of phenol is added. The mixture is gently inverted and left at room temperature for 10 min. The mixture is centrifuged at room temperature at 4000 rpm for 20 min. The supernatant aqueous phase is drawn off with wide-bore tips, and extracted with an equal volume of phenol/chloroform. To the supernatant are added 10% by volume of 3 M NaAc (pH 5.5) and 2.3 volumes of ethanol for precipitation. The DNA is carefully picked out using a glass rod and washed in 70% ethanol. After the ethanol is discarded, the DNA is dried. 2 ml of TE solution (pH 8.0) is added and the DNA is left to dissolve at 4° C. for 24 hr; about 1 mg of total DNA is obtained. The DNA is determined to be larger than 80 kb using 0.3% agarose gel electrophoresis.

b) Recovery of the Digestive Products of Total DNA

200 μl of total DNA (100 μg) is digested with 5U of restriction endonuclease Sau3AI at room temperature for 20 min, 30 min and 45 min, respectively. The products are combined, EDTA is added to a final concentration of 0.25 mM, and the mixture is then extracted with an equal volume of phenol/chloroform. To the supernatant are added 10% by volume of 3 M NaAc and 2.3 volumes of ethanol for precipitation. The precipitate is washed with 70% ethanol, and dried as mentioned above. Then the precipitate is dissolved in 200 μl of TE, and loaded on 12 ml of a 10–40% sucrose density gradient in a Beckman SW28 ultracentrifuge tube. The sample is placed in the Beckman SW28 rotor at 20° C. and centrifuged at 120000 g for 18 hrs. Fractions of 0.5 ml each are collected from the top, and a 15 μl aliquot of each is analyzed by 0.3% agarose gel electrophoresis. The fractions containing DNA of 30–40 kb are combined, about 2 volumes of deionized water and 7 volumes of ethanol are added, and the DNA is precipitated at −20° C. overnight, then washed with 70% ethanol, dried and dissolved in 50 μl of TE.
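Sau3AI recognizes the four-base site GATC and cuts immediately 5′ of the G, leaving ends compatible with BamHI-cut vector. The following Python sketch locates these sites on one strand of a linear sequence and returns the fragments of a complete digest; the function name `sau3ai_fragments` and the toy sequence are illustrative only. The timed partial digests described above cut only a subset of these sites, which is what allows fragments of 30–40 kb to be recovered.

```python
def sau3ai_fragments(seq: str) -> list:
    """Fragments from a complete Sau3AI digest of a linear top strand.

    Sau3AI cuts before the G of every GATC, so each internal fragment
    starts with GATC.
    """
    seq = seq.upper()
    cuts = []
    i = seq.find("GATC")
    while i != -1:
        cuts.append(i)                 # cut position: just 5' of the G
        i = seq.find("GATC", i + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop the empty fragment that arises if the sequence starts with GATC.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


toy = "AAGATCTTTTGATCCC"               # made-up sequence with two sites
print(sau3ai_fragments(toy))
# ['AA', 'GATCTTTT', 'GATCCC']
```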

c) Construction of Genomic Library of 4G-1

SuperCos1 cosmid vector is digested with XbaI, treated with alkaline phosphatase and digested with BamHI, then ligated with the isolated total DNA fragments described above, using the method of Stratagene.

The Gigapack III Gold packaging extract is used to package the ligation product in vitro. The library is titrated with a Stratagene kit on E. coli XL1-Blue MR (Stratagene), using the method of Stratagene. The obtained library is then amplified and stored according to the instructions of the same manufacturer.
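The number of cosmid clones needed for a library of this kind to contain any given genomic region can be estimated with the Clarke and Carbon formula, N = ln(1 − P) / ln(1 − f), where f is the insert size divided by the genome size. The Python sketch below applies it under stated assumptions: the 35 kb insert is the midpoint of the 30–40 kb fraction recovered above, and the genome size of 6.2 Mb is an assumed, typical value for Pseudomonas putida, since the genome size of strain 4G-1 is not given in this document. The function name `clones_needed` is hypothetical.

```python
import math


def clones_needed(insert_bp: float, genome_bp: float,
                  probability: float = 0.99) -> int:
    """Clarke-Carbon estimate of the clones required to include any
    given sequence with the stated probability."""
    if not 0 < probability < 1:
        raise ValueError("probability must be between 0 and 1")
    if not 0 < insert_bp < genome_bp:
        raise ValueError("insert must be smaller than the genome")
    f = insert_bp / genome_bp          # fraction of the genome per clone
    return math.ceil(math.log(1 - probability) / math.log(1 - f))


ASSUMED_GENOME_BP = 6.2e6              # assumption, not from this document
INSERT_BP = 35_000                     # midpoint of the 30-40 kb fraction

print(clones_needed(INSERT_BP, ASSUMED_GENOME_BP))         # P = 0.99
print(clones_needed(INSERT_BP, ASSUMED_GENOME_BP, 0.999))  # P = 0.999
```

Under these assumptions the estimate is on the order of several hundred to somewhat over a thousand independent clones, which a titrated cosmid library would normally exceed.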

Example 4 Isolation, Screening, Sequencing and Analysis of a Glyphosate-Tolerant EPSPS Gene

a) Screening of the Glyphosate-Tolerant Gene Library

1 ml of the library stock described above is centrifuged, the supernatant is discarded and the pellet is resuspended in 1 ml of sterile saline. The centrifugation is repeated, the supernatant is discarded, and the pellet is again resuspended in 1 ml of sterile saline. The suspension is spread onto the 10 mM glyphosate-50 mg/L Ampicillin plate (20 mM ammonium sulfate; 0.4% glucose; 10 mM glyphosate; 0.5 mM dipotassium hydrogen phosphate; 0.1 mg/L ferric sulfate; 0.5 g/L magnesium sulfate; 0.5 g/L calcium chloride; 2.1 g/L sodium chloride; 50 mM Tris (pH 7.2); 5 mg/L Vitamin B1; 15 g/L agarose) at a density of about 10³ bacteria/plate. The plate is cultured overnight at 37° C. One strain is obtained and designated as BDS. The cosmid carried by the strain is designated as pKU2001.

b) Isolation of the Glyphosate-Tolerant Gene

The strain BDS is inoculated into 20 ml of LB (supplemented with 50 mg/L Ampicillin) in a 50 ml flask and cultured at 37° C. for 12 hr with shaking at 300 rpm. The cells are collected by centrifugation. The plasmid pKU2001 is extracted by the alkaline method according to Molecular Cloning: A Laboratory Manual, supra. This plasmid and, in parallel, the empty cosmid vector are transformed into E.coli XL1-Blue MR (Stratagene). The strains are streaked on the 10 mM glyphosate plate. Those carrying only the empty cosmid vector are not able to grow on the glyphosate medium, while those carrying pKU2001 grow well, indicating that pKU2001 carries the glyphosate-tolerant gene.

pKU2001 is digested with Sau3AI. DNA fragments of 2–4 kb are recovered using 0.7% agarose gel electrophoresis and ligated to the BamH I-digested and dephosphorylated pUC18 vector (Yanisch-Perron, C., Vieira, J. and Messing, J. 1985, Gene 33:103–119). The ligation product is transformed into E.coli XL1-Blue MR (Stratagene) and the cells are streaked on the 10 mM glyphosate plate supplemented with 50 mg/L Ampicillin. The plates are cultured overnight at 37° C. and dozens of clones are obtained. The clones are picked and their plasmids are extracted. A plasmid carrying an exogenous fragment of about 2 kb is identified and designated as pKU2002. The plasmid pKU2002 and the empty vector pUC18 are transformed into E.coli XL1-Blue MR (Stratagene) respectively. The strains are streaked on the 10 mM glyphosate plates with 50 μg/ml Ampicillin. Those carrying the empty pUC18 vector are not able to grow on the glyphosate plate, while those carrying the pKU2002 plasmid grow well, indicating that the plasmid pKU2002 carries the glyphosate-tolerant gene.

c) pKU2002 is sequenced, and a full-length sequence of 1914 bp is obtained as shown in SEQ ID NO:2.

d) The sequence of pKU2002 is analyzed using DNASIS software. The only possible open reading frame (ORF, nucleotides 574–1803 of SEQ ID NO:2) is determined. The amino acid sequence encoded is shown in SEQ ID NO:3.
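The ORF coordinates can be checked against the 1230 bp length quoted for the gene with a short calculation. This is only an illustrative sketch; in particular, whether nucleotides 574–1803 include a terminal stop codon is an assumption, since the text does not say, so the residue count is given both ways.

```python
def orf_summary(start: int, end: int) -> tuple[int, int]:
    """Return (length in bp, number of codons) for a 1-based, inclusive ORF."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length, length // 3

# Coordinates taken from the text: nucleotides 574-1803 of SEQ ID NO:2.
length_bp, codons = orf_summary(574, 1803)

# ASSUMPTION: if the last codon is a stop codon, the protein is one residue shorter.
residues_if_stop_included = codons - 1
residues_if_stop_excluded = codons

print("ORF length (bp):", length_bp)
print("codons:", codons)
print("residues (stop codon included in range):", residues_if_stop_included)
print("residues (stop codon not included):", residues_if_stop_excluded)
```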

e) The protein sequence is subjected to a BLAST search against the GenBank protein sequence database of the U.S. National Center for Biotechnology Information (NCBI). The sequence of the protein is found to be 37% homologous to the amino acid sequence of the EPSPS of Clostridium acetobutylicum and 20% homologous to that of the EPSPS of E.coli. The 1230 bp sequence is therefore thought to encode an EPSP synthase, and the gene is designated as pparoA. Analysis shows that the gene does not belong to either known class of EPSPS genes; it is a novel EPSPS gene (Class III). The EPSPS amino acid sequence alignment of E.coli, Clostridium acetobutylicum and P. P4G-1 is shown in FIG. 2.

f) A pair of primers is designed, each comprising a BamH I site (GGATCC):

Primer 3: 5′-CGGGATCCTAAGTAAGTGAAAGTAACAATACAGC-3′ (SEQ ID NO:6)

Primer 4: 5′-CGGGATCCCTTCTTCGGACAATGACAGAC-3′ (SEQ ID NO:7)
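Because the underlining that marks the restriction site does not survive in plain text, the position of the BamHI recognition sequence (GGATCC) in each primer can be located programmatically. The snippet below is a minimal sketch using only the primer sequences given above; the dictionary and function names are illustrative.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "Primer 3": "CGGGATCCTAAGTAAGTGAAAGTAACAATACAGC",
    "Primer 4": "CGGGATCCCTTCTTCGGACAATGACAGAC",
}

def site_offset(primer: str, site: str = BAMHI_SITE) -> int:
    """Return the 0-based offset of the first occurrence of the site, or -1."""
    return primer.find(site)

offsets = {name: site_offset(seq) for name, seq in PRIMERS.items()}

for name, offset in offsets.items():
    seq = PRIMERS[name]
    # Show the site in brackets so it stands out from the flanking bases.
    marked = seq[:offset] + "[" + seq[offset:offset + 6] + "]" + seq[offset + 6:]
    print(name, marked)
```

In both primers the site sits near the 5′ end, preceded by two extra bases and followed by the gene-specific portion.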

The PCR amplification is run with pKU2001 as the template. The amplified fragment is digested with BamHI and inserted into pUC18 to obtain plasmid pKU2003. Sequencing shows that no mismatched base is introduced. The fragment is excised from pKU2003 with BamHI and ligated in the forward direction into the BamHI site of pACYC184 (Chang, A. C. Y., and Cohen, S. N., 1978, J. Bacteriol. 134: 1141–1156) to obtain plasmid pKU2004, the map of which is shown in FIG. 1. The transcription of the pparoA gene in this plasmid is initiated by the Tcʳ promoter derived from pACYC184.

Example 5 Cloning of E.coli aroA Gene and Site-Directed Mutagenesis of its Glyphosate Tolerance (Control Test)

The E.coli ET8000 (MacNeil, T., MacNeil, D., and Tyler, B. 1982 J. Bacteriol. 150: 1302–1313) is inoculated into 3 ml of LB liquid medium in a 15 ml tube, and cultured with shaking at 37° C. overnight. The strain is centrifuged, and total DNA is extracted according to the method described above.

A pair of primers is designed, each including a BamHI site (GGATCC):

Primer 5: 5′-CGGGATCCGTTAATGCCGAAATTTTGCTTAATC-3′ (SEQ ID NO:8)

Primer 6: 5′-CGGGATCCAGGTCCGAAAAAAAACGCCGAC-3′ (SEQ ID NO:9)

The E.coli aroA gene, which encodes the EPSPS protein of E.coli, is obtained by amplification using the total DNA of E.coli as the template. The amplified fragment is digested with BamHI and inserted into pUC18 to obtain the plasmid pKU2005. The insert is sequenced and SEQ ID NO:10 is obtained. The sequence is confirmed to be correct by alignment with the known EPSPS gene sequence of E.coli in the GenBank database of NCBI. After the plasmid pKU2005 is digested with BamHI, the small fragment is recovered and inserted in the forward direction into the BamHI site of pACYC184, and the plasmid pKU2006 is obtained.

The aroA gene of E.coli is subjected to site-directed mutagenesis. The guanine at position 287 is mutated to cytosine, so that the glycine at position 96 of the E.coli EPSPS protein is changed to alanine. As above, the mutated gene fragment is inserted into the BamHI site of pACYC184 to obtain plasmid pKU2007.
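The link between the nucleotide change and the amino acid change follows from codon arithmetic and the standard genetic code. The sketch below is illustrative only: because the third base of codon 96 is not given in the text, it checks all four glycine codons rather than assuming one.

```python
def codon_of(nt_position: int) -> tuple[int, int]:
    """Map a 1-based nucleotide position in a coding sequence to
    (1-based codon number, 1-based position within that codon)."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1

# Standard genetic code: all glycine codons are GGN, all alanine codons are GCN.
GLYCINE_CODONS = {"GGA", "GGC", "GGG", "GGT"}
ALANINE_CODONS = {"GCA", "GCC", "GCG", "GCT"}

def mutate(codon: str, position_in_codon: int, new_base: str) -> str:
    """Return the codon with the base at the 1-based position replaced."""
    index = position_in_codon - 1
    return codon[:index] + new_base + codon[index + 1:]

codon_number, position_in_codon = codon_of(287)
print("nucleotide 287 -> codon", codon_number, "position", position_in_codon)

# Whatever the third base is, G -> C at the second position turns Gly into Ala.
for gly in sorted(GLYCINE_CODONS):
    mutant = mutate(gly, position_in_codon, "C")
    print(gly, "->", mutant, "(Ala)" if mutant in ALANINE_CODONS else "(other)")
```

Nucleotide 287 is the second base of codon 96, which is why a single G to C substitution at that position is sufficient for the Gly96 to Ala change.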

Example 6 The EPSPS Function-Complementation Experiment of E.coli aroA⁻ Strain

pACYC184, pKU2004, pKU2006 and pKU2007 are transformed into E.coli AB2889 (E.coli aroA⁻ strain, from Yale University) respectively. The transformants are streaked for culture on M63 medium (13.6 g/L KH₂PO₄, 0.5 mg/L FeSO₄·7H₂O, 20 mM (NH₄)₂SO₄, 0.4% glucose, 1 mM magnesium sulfate, 0.5 mg/L Vitamin B1) containing chloramphenicol at a final concentration of 25 mg/L. The results are shown in Table 1.

The aAAS components are supplemented as follows:

100 mg/L Phenylalanine

100 mg/L Tyrosine

100 mg/L Tryptophan

5 mg/L p-aminobenzoic acid

5 mg/L 2,3-dihydroxybenzoic acid

5 mg/L p-hydroxybenzoic acid

TABLE 1 The experiments of EPSPS function-complementation and glyphosate tolerance of aroA-deficient E. coli strain

                       EPSPS function-complementation and glyphosate tolerance
Plasmid carried        M63 medium    M63 medium        10 mM glyphosate
by AB2889                            (supp. aAAS)      tolerance
pACYC184               −             +                 −
pKU2006                +             +                 −
pKU2007                +             +                 +
pKU2004                +             +                 +
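The logic used to read Table 1 can be written out explicitly: growth on M63 without the aAAS supplement indicates that the plasmid supplies a functional EPSPS, and growth at 10 mM glyphosate indicates tolerance. The following is a small sketch that encodes the table as given; the variable names are illustrative.

```python
# Each entry: (grows on M63, grows on M63 + aAAS, tolerates 10 mM glyphosate),
# transcribed from Table 1 ("+" -> True, "-" -> False).
TABLE_1 = {
    "pACYC184": (False, True, False),
    "pKU2006": (True, True, False),
    "pKU2007": (True, True, True),
    "pKU2004": (True, True, True),
}

# Growth without aromatic supplements means the aroA defect is complemented.
complementing = [name for name, row in TABLE_1.items() if row[0]]

# Growth at 10 mM glyphosate means the encoded EPSPS is glyphosate tolerant.
tolerant = [name for name, row in TABLE_1.items() if row[2]]

# A strain that fails even with supplements would point to a problem
# unrelated to EPSPS; none is expected here.
failed_control = [name for name, row in TABLE_1.items() if not row[1]]

print("complement aroA:", complementing)
print("glyphosate tolerant:", tolerant)
print("failed supplemented control:", failed_control)
```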

At the same time, the growth curves of the strains are measured in liquid culture. The results show that, like the control aroA gene of E.coli (pKU2006), the gene carried by pKU2004 is able to completely complement the EPSPS function of the aroA-deficient E.coli AB2889, suggesting that the 1230-bp nucleic acid sequence carried by said plasmid is an EPSPS-encoding gene, and that the EPSPS encoded by said gene is glyphosate tolerant.

Example 7 The Glyphosate Tolerance of the Novel EPSPS Gene

The plasmids pKU2004, pKU2006 and pKU2007 are transformed into E.coli XL1-Blue MR, respectively. The strains are inoculated and cultured overnight in M63 medium separately, and then transferred to M63 medium supplemented with different concentrations of glyphosate. The growth curves are measured. The results show that the E.coli transformed with pKU2006 is inhibited significantly when growing in the 5 mM glyphosate medium, and does not grow in the 40 mM glyphosate medium. In contrast, the E.coli transformed with pKU2004 and pKU2007 are not obviously inhibited when growing in the 40 mM glyphosate medium, and the E.coli transformed with pKU2004 grows well in the 120 mM glyphosate medium (FIG. 3: growth curve).

1. An isolated nucleic acid encoding a glyphosate tolerant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), wherein the synthase is characterized by that it: (i) comprises the amino acid sequence shown in SEQ ID NO:3, or (ii) is the glyphosate tolerant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) which is modified by substitution, deletion or addition of one or more amino acids to the amino acid sequence of (i) and is at least 95% identical to SEQ ID NO:3.

2. The isolated nucleic acid according to claim 1, characterized by that it: (i) comprises the sequence shown in nucleotides 574–1803 of SEQ ID NO:2, or (ii) comprises the nucleotide sequence which is modified by substitution of one or more nucleotides, or the deletion or addition of three or a multiple of three nucleotides to the nucleotide sequence of (i), and the protein encoded by said nucleotide sequence has the activity of glyphosate tolerant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) and the glyphosate tolerance.

3. A nucleic acid construct, which comprises the isolated nucleic acid according to claim 1.

4. A vector, which carries the isolated nucleic acid according to claim 1.

5. A host cell, which is transformed by the nucleic acid construct according to claim 3.

6. A host cell or progeny cell hereof, wherein said cell contains the isolated nucleic acid according to claim 1 and has the glyphosate tolerance.

7. A host cell or progeny cell hereof, wherein said cell contains the isolated nucleic acid according to claim 2 and has the glyphosate tolerance.

8. A method for preparing a host cell, comprising operably linking the nucleic acid of claim 1 with appropriate control sequences, introducing them into an appropriate vector, introducing said vector into a selected host cell, and expressing an active glyphosate tolerant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) encoded by the nucleic acid sequence.

9. A nucleic acid construct, which comprises the isolated nucleic acid according to claim 2.

10. A vector, which carries the isolated nucleic acid according to claim 2.

11. A vector, which carries the nucleic acid construct according to claim 3.

12. A vector, which carries the nucleic acid construct according to claim 9.

13. A host cell, which is transformed by the nucleic acid construct according to claim 9.

14. A host cell, which is transformed by the nucleic acid construct according to claim 4.

15. A host cell, which is transformed by the nucleic acid construct according to claim 10.

16. A host cell, which is transformed by the nucleic acid construct according to claim 11.

17. A host cell, which is transformed by the nucleic acid construct according to claim 12.

18. The isolated nucleic acid according to claim 1, wherein the nucleic acid encodes a polypeptide that is at least 96% identical to SEQ ID NO:3.

19. The isolated nucleic acid according to claim 2, characterized by that it: (i) comprises the nucleotide sequence shown in SEQ ID NO:2, or (ii) comprises the nucleotide sequence which is modified by substitution of one or more nucleotides, or the deletion or addition of three or a multiple of three nucleotides to the nucleotide sequence of (i), and the protein encoded by said nucleotide sequence has the activity of glyphosate tolerant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) and the glyphosate tolerance.

20. The isolated nucleic acid according to claim 2, wherein the nucleic acid is at least 65% identical to nucleotides 574–1803 of SEQ ID NO:2.

21. The isolated nucleic acid according to claim 20, wherein the nucleic acid is at least 95% identical to nucleotides 574–1803 of SEQ ID NO:2.

22. The isolated nucleic acid according to claim 2, wherein the nucleic acid is at least 65% identical to SEQ ID NO:2.

23. The isolated nucleic acid according to claim 22, wherein the nucleic acid is at least 95% identical to SEQ ID NO:2.